Build initial portfolio type handlers for datamanager by forstmeier · Pull Request #633 · oscmcompany/fund

forstmeier · 2025-09-19T16:02:44Z

Overview

Changes

- add basic "save" and "query" functionality for Portfolio type
- update routes

Comments

This is going to look really similar to everything in prediction.rs and equity.rs but I'm going to cut a separate pull request and do some refactoring.

Summary by CodeRabbit

New Features
- Added portfolio API endpoints to submit and retrieve portfolio data.
- Portfolio submissions are stored as Parquet in cloud storage; retrieval by date reads Parquet from cloud.
- Responses include success details (row count and storage key); improved logging and error handling for portfolio operations.
Refactor
- Internal renaming and routing updates for prediction handlers; no changes to public endpoints, payloads, or behavior.

coderabbitai · 2025-09-19T16:03:07Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds a new portfolio routes module and wires it into the datamanager app router. Implements POST/GET /portfolio using Axum, in-memory Parquet -> S3 upload, and DuckDB httpfs reads from S3 with AWS credentials. Also renames two prediction handler functions and updates router registrations.

Changes

| Cohort / File(s) | Summary |
|---|---|
| App router wiring `applications/datamanager/src/lib.rs` | Imports `routes::portfolio` and merges `portfolio::router()` into the main `Router<AppState>`. |
| Routes module index `applications/datamanager/src/routes/mod.rs` | Declares and exposes `pub mod portfolio;`. |
| Portfolio API implementation `applications/datamanager/src/routes/portfolio.rs` | New Axum module: defines Error enum, payloads, POST `/portfolio` (serialize Polars DataFrame to Parquet in-memory and upload to S3 with date-based key), GET `/portfolio` (load AWS creds, configure DuckDB httpfs, query Parquet on S3 and return DataFrame). Adds `pub fn router() -> Router<AppState>`. |
| Prediction handler renames `applications/datamanager/src/routes/prediction.rs` | Renames handlers `save_prediction` -> `save_predictions` and `query_prediction` -> `query_predictions`; updates router registrations and a log message. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant C as Client
  participant App as Datamanager App Router
  participant PR as Portfolio Router
  participant H as Handlers
  participant S3 as Amazon S3
  participant AK as AWS Config
  participant DB as DuckDB (httpfs)

  rect rgba(230,245,255,0.6)
  note over C,App: POST /portfolio
  C->>App: POST /portfolio { DataFrame, timestamp }
  App->>PR: route to portfolio
  PR->>H: save_portfolio(payload)
  H->>H: DataFrame -> Parquet (in-memory)
  H->>S3: PUT parquet (key YYYY/MM/DD/...)
  S3-->>H: 200 + s3_key
  H-->>C: 200 { rows, s3_key }
  end

  rect rgba(240,255,230,0.6)
  note over C,App: GET /portfolio
  C->>App: GET /portfolio?date=...
  App->>PR: route to portfolio
  PR->>H: get_portfolio(params)
  H->>AK: load AWS creds
  AK-->>H: creds
  H->>DB: CONFIGURE httpfs with creds
  H->>DB: SELECT ... FROM s3://.../parquet
  DB-->>H: rows/DataFrame
  H-->>C: 200 { data }
  end

  rect rgba(255,240,230,0.6)
  note over H: On failure
  H-->>C: 500 { error }
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

feature

Suggested reviewers

chrisaddy

Poem

I hop through routes with ears aloft,
Parquet crumbs and S3 so soft.
DuckDB hums a midnight tune,
Timestamps march beneath the moon.
New endpoints stitched — a rabbit's wink and scoff. 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title "Build initial portfolio type handlers for datamanager" is a concise, single-sentence summary that directly reflects the primary change in the diff (adding portfolio handlers/routes and related types to the datamanager), so it accurately communicates the main intent to reviewers scanning history. |


Copilot

Pull Request Overview

This PR adds basic portfolio management functionality to the datamanager application by implementing save and query operations for portfolio data. The changes establish portfolio type handlers similar to the existing prediction functionality.

- Adds portfolio data structure and handlers for saving/querying portfolio information
- Updates prediction routes to use plural naming conventions for consistency
- Integrates portfolio routes into the main application router

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| applications/datamanager/src/routes/prediction.rs | Renames functions to use plural forms for consistency |
| applications/datamanager/src/routes/portfolio.rs | Implements new portfolio handlers with save/query functionality |
| applications/datamanager/src/routes/mod.rs | Exports the new portfolio module |
| applications/datamanager/src/lib.rs | Integrates portfolio router into the main application |

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

applications/datamanager/src/routes/prediction.rs (2)
78-86: Return 500 on upload failure (not 200).

Failure path currently responds 200 OK, masking errors to clients.

Apply:
-            (
-                StatusCode::OK,
-                format!("S3 upload failed: {}\n\n{}", err, json_output),
-            )
+            (
+                StatusCode::INTERNAL_SERVER_ERROR,
+                format!("S3 upload failed: {}\n\n{}", err, json_output),
+            )
154-161: Content-Type says JSON but body is not JSON.

DataFrame::to_string() is plain text. Either serialize to JSON or set text/plain.

Minimal fix:
-                .insert(header::CONTENT_TYPE, "application/json".parse().unwrap());
+                .insert(header::CONTENT_TYPE, "text/plain; charset=utf-8".parse().unwrap());
If you want JSON, switch to a proper JSON serializer for DataFrame and keep application/json.

🧹 Nitpick comments (7)

applications/datamanager/src/routes/prediction.rs (3)
146-151: Guard empty requests to avoid invalid SQL.

WHERE ticker IN () is invalid when positions is empty.

Add early return:
 async fn query_predictions(
     State(state): State<AppState>,
     Json(payload): Json<QueryPredictionsPayload>,
 ) -> impl IntoResponse {
     info!("Fetching predictions from S3");
+    if payload.positions.is_empty() {
+        let mut response = Response::new(Body::from("[]"));
+        response
+            .headers_mut()
+            .insert(header::CONTENT_TYPE, "application/json".parse().unwrap());
+        *response.status_mut() = StatusCode::OK;
+        return response;
+    }
233-238: Avoid naive string interpolation for tickers.

Directly embedding values risks SQL syntax issues and future injection bugs. Prefer a VALUES list joined on ticker, or at least escape single quotes in inputs.
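
A minimal sketch of the "at least escape single quotes" option, using only the standard library. The helper names `quote_sql_literal` and `ticker_in_list` are hypothetical and do not appear in prediction.rs. The sketch assumes tickers end up inside single-quoted SQL string literals, where doubling an embedded quote is the standard escape. Parameter binding or a joined VALUES list would still be the safer route where the query API allows it.

```rust
/// Hypothetical helper: wrap a value in single quotes for use as a SQL
/// string literal, doubling any embedded single quotes.
fn quote_sql_literal(value: &str) -> String {
    format!("'{}'", value.replace('\'', "''"))
}

/// Hypothetical helper: build the contents of an `IN (...)` list.
/// Returns `None` for empty input so the caller can skip the query
/// instead of emitting the invalid `IN ()`.
fn ticker_in_list(tickers: &[&str]) -> Option<String> {
    if tickers.is_empty() {
        return None;
    }
    Some(
        tickers
            .iter()
            .map(|t| quote_sql_literal(t))
            .collect::<Vec<_>>()
            .join(", "),
    )
}
```

Returning `None` for an empty slice pairs with the early return suggested for 146-151: the handler can respond with an empty result without ever building the query string.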

131-132: Consider a more specific Parquet content type.

application/x-parquet is more precise than application/octet-stream.
-        .content_type("application/octet-stream")
+        .content_type("application/x-parquet")
applications/datamanager/src/routes/portfolio.rs (4)
97-101: Use tracing instead of println for structured logs.

Aligns with the rest of the module and preserves log metadata.
-                println!(
-                    "DataFrame successfully converted to parquet, size: {} bytes",
-                    buffer.len()
-                );
+                debug!(
+                    "DataFrame successfully converted to parquet, size: {} bytes",
+                    buffer.len()
+                );
131-147: Allow querying by timestamp (don’t hardcode Utc::now).

Using “now” will often miss data. Accept a query param (e.g., ?ts=...) or payload with date; default to latest only if not provided.

Happy to sketch the Axum extractor and parsing if you want it in this PR.

116-116: Parquet content type (optional).

Consider application/x-parquet instead of application/octet-stream.
-        .content_type("application/octet-stream")
+        .content_type("application/x-parquet")
75-129: DRY: Share S3 Parquet upload helper with predictions.

upload_dataframe_to_s3 is nearly identical across modules. Centralize to avoid drift.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 592b847 and 526a1f8.

📒 Files selected for processing (4)

applications/datamanager/src/lib.rs (2 hunks)
applications/datamanager/src/routes/mod.rs (1 hunks)
applications/datamanager/src/routes/portfolio.rs (1 hunks)
applications/datamanager/src/routes/prediction.rs (3 hunks)

🧰 Additional context used

🧬 Code graph analysis (2)

applications/datamanager/src/lib.rs (1)

applications/datamanager/src/routes/portfolio.rs (1)

router (231-235)

applications/datamanager/src/routes/portfolio.rs (1)

applications/datamanager/src/routes/prediction.rs (3)

upload_dataframe_to_s3 (90-144)

query_s3_parquet_data (173-280)

router (282-286)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: Run Python code checks
GitHub Check: Run Rust code checks
GitHub Check: Run Python code checks
GitHub Check: Run Rust code checks

🔇 Additional comments (3)

applications/datamanager/src/routes/prediction.rs (1)

283-286: Router mapping rename looks good.

Handlers and routes align with the new names.

applications/datamanager/src/lib.rs (1)

74-75: Portfolio router successfully merged.

Composition order looks fine and state layering is correct.

applications/datamanager/src/routes/mod.rs (1)

3-3: Module export added correctly.

pub mod portfolio; exposes the new routes module as intended.

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (3)

applications/datamanager/src/routes/portfolio.rs (3)
86-90: Path now matches write/read location — good fix.

S3 key uses equity/portfolios/... and aligns with the GET path. This resolves the earlier mismatch with predictions.

194-207: Valid DuckDB Parquet reference — good.

Using SELECT * FROM 's3://...parquet' fixes the invalid FROM clause pattern flagged earlier.

222-231: Don’t drop row mapping errors; collect and propagate.

filter_map(Result::ok) silently discards failures. Collect into Result<Vec<_>, _> and bubble errors; also avoid cloning Strings when building the DataFrame.

Apply:
-    let portfolios: Vec<Portfolio> = portfolios_iterator.filter_map(Result::ok).collect();
-
-    df!(
-        "ticker" => portfolios.iter().map(|p| p.ticker.clone()).collect::<Vec<_>>(),
-        "timestamp" => portfolios.iter().map(|p| p.timestamp).collect::<Vec<_>>(),
-        "side" => portfolios.iter().map(|p| p.side.clone()).collect::<Vec<_>>(),
-        "dollar_amount" => portfolios.iter().map(|p| p.dollar_amount).collect::<Vec<_>>(),
-    )
+    let portfolios: Vec<Portfolio> = portfolios_iterator
+        .collect::<Result<Vec<_>, _>>()
+        .map_err(|e| Error::Other(format!("Failed to collect portfolio rows: {}", e)))?;
+
+    df!(
+        "ticker" => portfolios.iter().map(|p| p.ticker.as_str()).collect::<Vec<_>>(),
+        "timestamp" => portfolios.iter().map(|p| p.timestamp).collect::<Vec<_>>(),
+        "side" => portfolios.iter().map(|p| p.side.as_str()).collect::<Vec<_>>(),
+        "dollar_amount" => portfolios.iter().map(|p| p.dollar_amount).collect::<Vec<_>>(),
+    )
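
To illustrate the difference in isolation, here is a small standard-library sketch that does not touch DuckDB or Polars. The function names `parse_lossy` and `parse_strict` are made up for this example, and integer parsing stands in for the row mapping; the point is only how `filter_map(Result::ok)` and `collect::<Result<Vec<_>, _>>()` behave when one item fails.

```rust
use std::num::ParseIntError;

/// Silently drops any input that fails to parse, as `filter_map(Result::ok)` does.
fn parse_lossy(inputs: &[&str]) -> Vec<i64> {
    inputs
        .iter()
        .map(|s| s.parse::<i64>())
        .filter_map(Result::ok)
        .collect()
}

/// Stops at the first failure and returns it, because collecting an iterator
/// of `Result<T, E>` into `Result<Vec<T>, E>` short-circuits on the first `Err`.
fn parse_strict(inputs: &[&str]) -> Result<Vec<i64>, ParseIntError> {
    inputs.iter().map(|s| s.parse::<i64>()).collect()
}
```

With the input `["1", "x", "3"]`, the lossy version returns two values and hides the bad row, while the strict version returns an error the handler can map to a 500.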

🧹 Nitpick comments (7)

applications/datamanager/src/routes/portfolio.rs (7)

63-71: Avoid echoing entire DataFrame in 500 responses.

Large payloads in error messages can bloat responses and leak data. Return shape only and log at error level.

Apply:

-        Err(err) => {
-            info!("Failed to upload to S3: {}", err);
-            let json_output = portfolio.to_string();
-
-            (
-                StatusCode::INTERNAL_SERVER_ERROR,
-                format!("S3 upload failed: {}\n\n{}", err, json_output),
-            )
-        }
+        Err(err) => {
+            error!("Failed to upload portfolio to S3: {}", err);
+            (
+                StatusCode::INTERNAL_SERVER_ERROR,
+                format!(
+                    "S3 upload failed: {} (rows={}, cols={})",
+                    err,
+                    portfolio.height(),
+                    portfolio.width()
+                ),
+            )
+        }

Also update tracing import:

- use tracing::{debug, info};
+ use tracing::{debug, info, error};

139-145: Use error! for failure path.

Log GET failures at error level.

Apply:

-        Err(err) => {
-            info!("Failed to query S3 parquet data: {}", err);
+        Err(err) => {
+            error!("Failed to query S3 parquet data: {}", err);
             (
                 StatusCode::INTERNAL_SERVER_ERROR,
                 format!("Failed to fetch portfolio: {}", err),
             )
         }

And ensure error is imported:

- use tracing::{debug, info};
+ use tracing::{debug, info, error};

97-101: Use tracing instead of println!.

Prefer debug! to keep logs structured and adjustable by level.

Apply:

-                println!(
-                    "DataFrame successfully converted to parquet, size: {} bytes",
-                    buffer.len()
-                );
+                debug!(
+                    "DataFrame successfully converted to parquet, size: {} bytes",
+                    buffer.len()
+                );

116-116: Set a Parquet-specific content type.

application/vnd.apache.parquet is clearer than octet-stream.

Apply:

-        .content_type("application/octet-stream")
+        .content_type("application/vnd.apache.parquet")

131-139: GET should accept a date; Utc::now() rarely matches uploaded partition.

Make timestamp optional via query string and default to now.

Apply:

-async fn get_portfolio(State(state): State<AppState>) -> impl IntoResponse {
+async fn get_portfolio(
+    State(state): State<AppState>,
+    Query(params): Query<PortfolioQuery>,
+) -> impl IntoResponse {
     info!("Fetching portfolio from S3");
 
-    match query_s3_parquet_data(&state, &Utc::now()).await {
+    let ts = params.timestamp.unwrap_or_else(Utc::now);
+    match query_s3_parquet_data(&state, &ts).await {
         Ok(df) => {
             let json_output = df.to_string();
             (StatusCode::OK, json_output)
         }
         Err(err) => {

Add the helper type and import (outside this hunk):

use axum::extract::Query;

#[derive(Deserialize)]
struct PortfolioQuery {
    #[serde(default)]
    timestamp: Option<DateTime<Utc>>,
}

75-129: Shared helper duplication with prediction.rs.

upload_dataframe_to_s3 mirrors prediction.rs; consider extracting a reusable util (e.g., s3::parquet::upload_dataframe(state, df, key_prefix)).

I can sketch a small crate-local module with generic key builders for equity/{predictions|portfolios}/daily/... and a single upload function.
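
One possible shape for the key-builder half of such a module, as a standard-library sketch. Everything here is hypothetical: `DataKind`, `daily_prefix`, and `daily_key` are illustrative names, and the part of the key after `daily/` is left to the caller because this review does not show the full layout.

```rust
/// Hypothetical enum for the two data kinds named in this review.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DataKind {
    Predictions,
    Portfolios,
}

impl DataKind {
    /// Path segment used for this kind inside the S3 key.
    fn as_str(self) -> &'static str {
        match self {
            DataKind::Predictions => "predictions",
            DataKind::Portfolios => "portfolios",
        }
    }
}

/// Build the shared `equity/{kind}/daily/` prefix.
fn daily_prefix(kind: DataKind) -> String {
    format!("equity/{}/daily/", kind.as_str())
}

/// Join the prefix with a caller-supplied suffix (partition and file name),
/// trimming leading slashes so the join does not produce `//`.
fn daily_key(kind: DataKind, suffix: &str) -> String {
    format!("{}{}", daily_prefix(kind), suffix.trim_start_matches('/'))
}
```

Keeping the prefix in one place means the write path and the read path cannot drift apart, which is the mismatch flagged earlier in this review.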

18-18: Import error to match new error-level logs.

Apply:

- use tracing::{debug, info};
+ use tracing::{debug, info, error};

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 526a1f8 and 53b29e8.

📒 Files selected for processing (1)

applications/datamanager/src/routes/portfolio.rs (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

applications/datamanager/src/routes/portfolio.rs (1)

applications/datamanager/src/routes/prediction.rs (2)

upload_dataframe_to_s3 (90-144)

query_s3_parquet_data (173-280)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: Run Rust code checks
GitHub Check: Run Python code checks
GitHub Check: Run Rust code checks
GitHub Check: Run Python code checks

🔇 Additional comments (1)

applications/datamanager/src/routes/portfolio.rs (1)

30-34: Confirm serde support for DataFrame payloads.
applications/datamanager/Cargo.toml enables polars' "serde" feature (polars = { version = "0.50.0", features = [..., "serde", ...] }); chrono also has serde enabled — the SavePortfolioPayload serde usage is covered.

Build initial portfolio type handlers for datamanager

526a1f8

forstmeier added this to the Refactor milestone Sep 19, 2025

forstmeier requested review from chrisaddy and Copilot September 19, 2025 16:02

forstmeier self-assigned this Sep 19, 2025

forstmeier added application labels Sep 19, 2025

Copilot AI reviewed Sep 19, 2025

Comment thread applications/datamanager/src/routes/portfolio.rs Outdated

Comment thread applications/datamanager/src/routes/portfolio.rs Outdated

coderabbitai Bot reviewed Sep 19, 2025

Comment thread applications/datamanager/src/routes/portfolio.rs

Comment thread applications/datamanager/src/routes/portfolio.rs Outdated

Comment thread applications/datamanager/src/routes/portfolio.rs Outdated

forstmeier added 2 commits September 19, 2025 13:44

Build initial portfolio type handlers for datamanager

4d57538

Fix merge conflicts

53b29e8

coderabbitai Bot reviewed Sep 19, 2025

forstmeier merged commit 7d0be02 into master Sep 19, 2025
7 checks passed

forstmeier deleted the 09-19-build_initial_portfolio_type_handlers_for_datamanager branch September 19, 2025 19:00

coderabbitai Bot mentioned this pull request Sep 23, 2025

Refactor datamanager module #635

Merged

coderabbitai Bot mentioned this pull request Jan 21, 2026

portfolio fixes #669

Merged

forstmeier merged 3 commits into master from 09-19-build_initial_portfolio_type_handlers_for_datamanager